Kernel Partial Least Squares for Speaker Recognition

Authors

  • Balaji Vasan Srinivasan
  • Daniel Garcia-Romero
  • Dmitry N. Zotkin
  • Ramani Duraiswami
Abstract

I-vectors are a concise representation of speaker characteristics. Recent advances in speaker recognition have used their ability to capture speaker and channel variability to build efficient recognition engines. Inter-speaker relationships in the i-vector space are non-linear. Effective speaker recognition requires good modeling of these non-linearities and can be cast as a machine learning problem. In this paper, we propose a kernel partial least squares (kernel PLS, or KPLS) framework for modeling speakers in the i-vector space. The resulting recognition system is tested across several conditions of the NIST SRE 2010 extended core data set and compared against state-of-the-art systems: Joint Factor Analysis (JFA), Probabilistic Linear Discriminant Analysis (PLDA), and Cosine Distance Scoring (CDS) classifiers. The KPLS system shows improvements over these baselines.
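To make the idea of kernel PLS concrete, the following is a minimal sketch of a generic single-output KPLS regressor in NumPy. It is not the authors' implementation: the Gaussian (RBF) kernel, the value of gamma, the one-vs-rest labels of +1/-1, the synthetic 2-D points standing in for i-vectors, and all function names are assumptions made only for illustration.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel between rows of A and rows of B (assumed kernel choice)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def center_train(K):
    """Center the training kernel matrix in feature space."""
    n = K.shape[0]
    H = np.eye(n) - np.full((n, n), 1.0 / n)
    return H @ K @ H


def center_test(K_test, K_train):
    """Center a test-vs-train kernel matrix using training statistics."""
    n = K_train.shape[0]
    H = np.eye(n) - np.full((n, n), 1.0 / n)
    avg = np.full((K_test.shape[0], n), 1.0 / n)
    return (K_test - avg @ K_train) @ H


def kpls_fit(Kc, y, n_components):
    """Single-output kernel PLS on a centered kernel; returns dual weights and mean of y."""
    n = Kc.shape[0]
    y_mean = y.mean()
    yc = y - y_mean
    Kd, yd = Kc.copy(), yc.copy()
    T, U = [], []
    for _ in range(n_components):
        norm_y = np.linalg.norm(yd)
        if norm_y < 1e-12:          # nothing left to explain
            break
        u = yd / norm_y             # with one output, u is the deflated target
        t = Kd @ u                  # latent score vector
        norm_t = np.linalg.norm(t)
        if norm_t < 1e-12:
            break
        t /= norm_t
        T.append(t)
        U.append(u)
        P = np.eye(n) - np.outer(t, t)
        Kd = P @ Kd @ P             # deflate the kernel
        yd = yd - t * (t @ yd)      # deflate the target
    T, U = np.column_stack(T), np.column_stack(U)
    alpha = U @ np.linalg.solve(T.T @ Kc @ U, T.T @ yc)
    return alpha, y_mean


def demo(seed=0):
    """Two concentric noisy rings: a toy non-linear two-class problem."""
    rng = np.random.default_rng(seed)

    def rings(n):
        ang = rng.uniform(0.0, 2.0 * np.pi, 2 * n)
        rad = np.r_[np.full(n, 1.0), np.full(n, 3.0)] + 0.1 * rng.standard_normal(2 * n)
        X = np.c_[rad * np.cos(ang), rad * np.sin(ang)]
        y = np.r_[np.ones(n), -np.ones(n)]   # target class = +1, others = -1
        return X, y

    X_tr, y_tr = rings(100)
    X_te, y_te = rings(50)
    K_tr = rbf_kernel(X_tr, X_tr, gamma=0.5)
    K_te = rbf_kernel(X_te, X_tr, gamma=0.5)
    alpha, y_mean = kpls_fit(center_train(K_tr), y_tr, n_components=3)
    scores = center_test(K_te, K_tr) @ alpha + y_mean
    return float(np.mean(np.sign(scores) == y_te))


if __name__ == "__main__":
    print("held-out accuracy:", demo())
```

In this toy setup the two classes are not linearly separable in the input space, so the example illustrates why a kernelized regressor can help when class relationships are non-linear. A real system would replace the synthetic points with i-vectors and would need its own choice of kernel, number of latent components, and score calibration.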

Similar resources

A symmetric kernel partial least squares framework for speaker verification

I-vectors are concise representations of speaker characteristics. Recent progress in i-vector-related research has utilized their ability to capture speaker and channel variability to develop efficient automatic speaker verification (ASV) systems. Inter-speaker relationships in the i-vector space are nonlinear. Accomplishing effective speaker verification requires a good modeling of these non-...

Scalable Learning for Geostatistics and Speaker Recognition (Doctor of Philosophy dissertation, Balaji Vasan Srinivasan, 2011)

Thesis directed by: Professor Ramani Duraiswami, Department of Computer Science. With improved data acquisition methods, the amount of data that is being collected has increased several fold. One of the objectives in data collection is to learn useful underlying pa...

Mitigating Effects of Recording Condition Mismatch in Speaker Recognition Using Partial Least Squares

Speaker recognition systems have been shown to work well when recordings are collected in conditions with relatively limited mismatch. Thus, a significant focus of the current research is techniques for robust system performance when greater variability is present. This study considers a diverse data set with recordings collected in multiple different rooms with different types of microphones. ...

Voice conversion for non-parallel datasets using dynamic kernel partial least squares regression

Voice conversion aims at converting speech from one speaker to sound as if it was spoken by another specific speaker. The most popular voice conversion approach based on Gaussian mixture modeling tends to suffer either from model overfitting or oversmoothing. To overcome the shortcomings of the traditional approach, we recently proposed to use dynamic kernel partial least squares (DKPLS) regres...

Scalable learning for geostatistics and speaker recognition

With improved data acquisition methods, the amount of data that is being collected has increased several fold. One of the objectives in data collection is to learn useful underlying patterns. In order to work with data at this scale, the methods not only need to be effective with the underlying data, but also have to be scalable to handle larger data collections. My research focused on developi...

Publication date: 2011